What We Did

  • We analyzed cuts at power plants in Turkey between 2012 and 2018.
  • We had 73,313 observations of 8 variables in total.
  • We derived new variables from the existing ones: plant type (Plant.Type), duration of the cut, capacity ratio at the cut, and reason for the cut.
  • We tidied the raw data using regular expressions and the stringr package.
  • We used tidy text mining to count words and to find which word follows which.
  • We divided cuts into two categories, Malfunctions and Planned Activities, and examined their distributions.
  • We compared malfunctions and planned activities in terms of cut duration.
  • We examined malfunction types, malfunction reasons, and durations by plant type.
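
The duration comparison can be sketched roughly as follows. This is a minimal illustration, not the code used for the analysis: the column names Category, Start, and End are assumptions, and the four rows are made-up toy data rather than values from the real dataset.

```r
library(dplyr)

# Toy data, invented for illustration only (column names are assumptions)
cuts_toy <- tibble(
  Category = c("Malfunction", "Malfunction", "Malfunction", "Planned Activity"),
  Start = as.POSIXct(c("2018-01-01 00:00", "2018-01-02 06:00",
                       "2018-01-03 12:00", "2018-02-01 00:00"), tz = "UTC"),
  End   = as.POSIXct(c("2018-01-01 02:00", "2018-01-02 07:00",
                       "2018-01-03 15:00", "2018-02-03 00:00"), tz = "UTC")
)

# Derive the duration of each cut in hours, then summarise per category
duration_summary <- cuts_toy %>%
  mutate(Duration.Hours = as.numeric(difftime(End, Start, units = "hours"))) %>%
  group_by(Category) %>%
  summarise(
    n_cuts = n(),
    median_hours = median(Duration.Hours),
    .groups = "drop"
  )

print(duration_summary)
```

The median is used here instead of the mean because a few very long cuts would otherwise dominate the comparison; that choice is ours for the sketch, not something the analysis prescribes.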

Cuts at Power Plants in Turkey (2012-2018)

Yearly incidents are much higher in 2018.

Glimpse of Cleaning

It was not easy

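```r
library(dplyr)    # provides the %>% pipe
library(stringr)  # provides str_replace_all()

# Normalise plant names so that variants of the same plant share one spelling
cuts$Plant.Name <- cuts$Plant.Name %>% 
  str_replace_all("^rwe_turcas_guney", "denizli rwe_turcas_guney") %>%
  str_replace_all("tekirdağ santrali.*", "modern enerji tekirdağ santrali") %>%
  str_replace_all("karadağ$", "karadağ res") %>%
  str_replace_all(".?menzelet( hes)?", "menzelet hes") %>%
  # Remove all dots so dotted abbreviations collapse
  str_replace_all("\\.", "") %>%
  # Unify plant-type suffixes (hes, tes, DGKC, jes)
  str_replace_all("hidro(\\s?elektrik santral[yi]| e\\.?s)", " hes") %>%
  str_replace_all("(termik santral[yi]|\\sts\\s?)", " tes") %>%
  str_replace_all("tunçbilektes", "tunçbilek tes") %>%
  # Greedy: matches from the first "d" to the end of the name when a "ç" follows
  str_replace_all("d.*(k.*)?ç.*(s.*)?", "DGKC") %>%
  str_replace_all("jeotermal (e.*s.*)", "jes")
```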

Overview of Plant Categories

We categorised power plants by type, because an analysis by plant name would not yield many useful results.

  • HES: Hydroelectric Plant

  • TES: Thermal Energy Plant

  • RES: Wind Energy Plant (Wind Turbines)

  • DGKC: Natural Gas Combined Cycle Plant

  • JES: Geothermal Energy Plant
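
One way such a categorisation could be done is to read the type from the suffix of the cleaned plant name. The sketch below is an assumption about the approach, not the code from the analysis: classify_plant_type is a hypothetical helper, and the names prefixed with "example" are invented.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: map a cleaned plant name to its type via its suffix.
# Names that match no known suffix are left as NA.
classify_plant_type <- function(name) {
  case_when(
    str_detect(name, "\\bhes\\b")  ~ "HES",
    str_detect(name, "\\btes\\b")  ~ "TES",
    str_detect(name, "\\bres\\b")  ~ "RES",
    str_detect(name, "\\bDGKC\\b") ~ "DGKC",
    str_detect(name, "\\bjes\\b")  ~ "JES",
    TRUE ~ NA_character_
  )
}

# Toy input: the "example ..." names are invented for illustration
plants <- tibble(
  Plant.Name = c("menzelet hes", "tunçbilek tes", "karadağ res",
                 "example DGKC", "example jes", "example plant")
) %>%
  mutate(Plant.Type = classify_plant_type(Plant.Name))

print(plants)
```

Leaving unmatched names as NA keeps the uncategorised share visible instead of forcing every plant into one of the five types.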

Overview of Plant Categories (cont'd)

Cut Reason by Text Mining
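
A minimal sketch of counting words and word pairs in free-text cut reasons, assuming the tidytext package is used for tokenisation. The column names id and reason and the three example texts are invented for illustration and do not come from the dataset.

```r
library(dplyr)
library(tidytext)

# Toy free-text reasons, invented for illustration only
reasons <- tibble(
  id = 1:3,
  reason = c("turbine bearing failure",
             "planned turbine maintenance",
             "turbine bearing failure")
)

# Word counts: one row per word, then count occurrences
word_counts <- reasons %>%
  unnest_tokens(word, reason) %>%
  count(word, sort = TRUE)

# Bigram counts: pairs of consecutive words, showing which word follows which
bigram_counts <- reasons %>%
  unnest_tokens(bigram, reason, token = "ngrams", n = 2) %>%
  count(bigram, sort = TRUE)

print(word_counts)
print(bigram_counts)
```

Frequent bigrams can then serve as candidate labels for grouping cut reasons into categories.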

Cut Reason by Plant

Shutdown Reason by Category

Cut Reason by Category

Conclusions

  • The most time-consuming part was data transformation and cleaning. We were able to categorise 85% of plants and 70% of malfunctions by text mining.
  • Planned activities were fewer in number but took much longer than malfunctions.
  • The number of data entries increased drastically, especially in 2018.
  • While thermal plants produce more power on average, hydroelectric plants have the highest total throughput.
  • In certain years, some plants reported many more malfunctions than others. Most of these were thermal plants, possibly because they are older.
  • Each type of plant has a different leading reason for shutdowns. The majority of all causes were categorised as rotating equipment failure, mostly from turbines.

Thanks
